Virat Kohli Performance Analysis

This project analyses Virat Kohli's batting performance in One Day Internationals (ODIs), using his innings between 18-Aug-08 and 22-Jan-17.

In [3]:
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
In [6]:
data=pd.read_csv('Virat_Kohli_ODI.csv')
In [7]:
print(data)
    Runs Mins   BF  4s  6s      SR  Pos Dismissal  Inns     Opposition  \
0     12   33   22   1   0   54.54    2       lbw     1    v Sri Lanka   
1     37   82   67   6   0   55.22    2    caught     2    v Sri Lanka   
2     25   40   38   4   0   65.78    1   run out     1    v Sri Lanka   
3     54   87   66   7   0   81.81    1    bowled     1    v Sri Lanka   
4     31   45   46   3   1   67.39    1       lbw     2    v Sri Lanka   
..   ...  ...  ...  ..  ..     ...  ...       ...   ...            ...   
127   45   64   51   2   1   88.23    3    caught     2  v New Zealand   
128   65  152   76   2   1   85.52    3    caught     1  v New Zealand   
129  122  147  105   8   5  116.19    3    caught     2      v England   
130    8    6    5   2   0     160    3    caught     1      v England   
131   55   81   63   8   0    87.3    3    caught     2      v England   

            Ground Start Date  
0         Dambulla  18-Aug-08  
1         Dambulla  20-Aug-08  
2    Colombo (RPS)  24-Aug-08  
3    Colombo (RPS)  27-Aug-08  
4    Colombo (RPS)  29-Aug-08  
..             ...        ...  
127         Ranchi  26-Oct-16  
128  Visakhapatnam  29-Oct-16  
129           Pune  15-Jan-17  
130        Cuttack  19-Jan-17  
131        Kolkata  22-Jan-17  

[132 rows x 12 columns]
In [8]:
# Treat "*" as a literal character rather than a regular expression
data["Runs"] = data["Runs"].str.replace("*", "", regex=False)
data["Runs"]
Out[8]:
0       12
1       37
2       25
3       54
4       31
      ... 
127     45
128     65
129    122
130      8
131     55
Name: Runs, Length: 132, dtype: object
In [9]:
data["Runs"] = data["Runs"].astype(int)
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 132 entries, 0 to 131
Data columns (total 12 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   Runs        132 non-null    int64 
 1   Mins        132 non-null    object
 2   BF          132 non-null    int64 
 3   4s          132 non-null    int64 
 4   6s          132 non-null    int64 
 5   SR          132 non-null    object
 6   Pos         132 non-null    int64 
 7   Dismissal   132 non-null    object
 8   Inns        132 non-null    int64 
 9   Opposition  132 non-null    object
 10  Ground      132 non-null    object
 11  Start Date  132 non-null    object
dtypes: int64(6), object(6)
memory usage: 12.5+ KB
In [10]:
total_runs = data["Runs"].sum()
total_runs
Out[10]:
6184

In ODIs, a batting average of 35-37 is normally considered good.

In [11]:
data['Runs'].mean()
Out[11]:
46.84848484848485
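
The mean above divides total runs by the number of innings, whereas a conventional batting average divides by the number of dismissals, so the two differ whenever some innings end not out. The following is a minimal sketch on made-up scores, assuming the `*` stripped from `Runs` earlier marked not-out innings. The toy values and variable names are hypothetical and not taken from the dataset.

```python
import pandas as pd

# Hypothetical toy scores; "*" is assumed to mark a not-out innings.
raw = pd.DataFrame({"Runs": ["12", "50*", "100", "8", "30*"]})

# Remember which innings were not out before stripping the marker.
not_out = raw["Runs"].str.endswith("*")
runs = raw["Runs"].str.replace("*", "", regex=False).astype(int)

# Runs per innings: this is what Series.mean() reports.
mean_per_innings = runs.sum() / len(runs)

# Batting average: runs per dismissal.
dismissals = int((~not_out).sum())
batting_average = runs.sum() / dismissals

print(mean_per_innings, round(batting_average, 2))
```

Because not-out innings add runs without adding a dismissal, the batting average is never lower than the per-innings mean.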

Now let us look at the trend of runs he scored across these matches.

In [12]:
# Use the row index as a match counter on the x-axis
matches = data.index
figure = px.line(data, x=matches, y='Runs',
                 title='Runs scored by Virat Kohli between 18-Aug-08 and 22-Jan-17',
                 template='plotly_dark')
figure.show()

In some matches Kohli scored a century or came close to one. Next, let us analyse his performance by batting position.

In [13]:
# Replace the numeric batting position with a readable label
data['Pos'] = data['Pos'].map({3: "Batting At 3", 4: "Batting At 4", 2: "Batting At 2",
                               1: "Batting At 1", 7: "Batting At 7", 5: "Batting At 5",
                               6: "Batting At 6"})
Pos=data["Pos"].value_counts()
label = Pos.index
counts = Pos.values
colors = ["gold","lightgreen", "pink", "blue", "skyblue", "cyan", "orange"]

fig = go.Figure(data=[go.Pie(labels=label, values=counts)])
fig.update_layout(title_text='Number of Matches At Different Batting Positions')
fig.update_traces(hoverinfo='label+percent', textinfo='value', textfont_size=30,
                  marker=dict(colors=colors, line=dict(color='black', width=3)))
fig.show()

Kohli batted at number three in 68.9% of these innings. Now let's calculate the total runs he scored at each batting position.

In [14]:
label = data["Pos"]
counts = data["Runs"]
colors = ['gold','lightgreen', "pink", "blue", "skyblue", "cyan", "orange"]

fig = go.Figure(data=[go.Pie(labels=label, values=counts)])
fig.update_layout(title_text='Runs By Virat Kohli At Different Batting Positions')
fig.update_traces(hoverinfo='label+percent', textinfo='value', textfont_size=30,
                  marker=dict(colors=colors, line=dict(color='black', width=3)))
fig.show()

72.4% of his runs were scored at batting position 3, which suggests that the third position suits him best.
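
The percentages read off the two pie charts can also be computed directly. The following is a minimal sketch on a made-up four-innings table. The `toy` frame, the `summary` name and the derived column names are hypothetical, so its numbers are not the 68.9% and 72.4% quoted for the real data.

```python
import pandas as pd

# Hypothetical toy innings, using the same Pos labels as the notebook.
toy = pd.DataFrame({
    "Pos": ["Batting At 3", "Batting At 3", "Batting At 4", "Batting At 3"],
    "Runs": [50, 100, 20, 30],
})

# Count innings and total the runs for each batting position.
summary = toy.groupby("Pos")["Runs"].agg(innings="count", runs="sum")

# Express both as a share of the overall totals.
summary["innings_pct"] = 100 * summary["innings"] / summary["innings"].sum()
summary["runs_pct"] = 100 * summary["runs"] / summary["runs"].sum()

print(summary.round(1))
```

Comparing `innings_pct` with `runs_pct` shows whether a position contributes more runs than its share of innings alone would predict.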

Now let us look at the centuries scored by Virat Kohli in the first and second innings.

In [15]:
centuries = data.query("Runs >= 100")
figure = px.bar(centuries, x=centuries["Inns"], y = centuries["Runs"], 
                color = centuries["Runs"],
                title="Centuries By Virat Kohli in First Innings Vs. Second Innings")
figure.show()

Most of his centuries were scored while batting in the second innings.
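
A count per innings makes this comparison explicit without reading it off the bar chart. The following is a minimal sketch on made-up scores. The `toy` frame and the `centuries_by_inns` name are hypothetical.

```python
import pandas as pd

# Hypothetical toy innings: Inns is 1 for batting first, 2 for chasing.
toy = pd.DataFrame({
    "Runs": [120, 45, 107, 100, 99, 133],
    "Inns": [2, 1, 2, 1, 2, 2],
})

# Keep scores of 100 or more, then count them per innings.
centuries_by_inns = toy.query("Runs >= 100")["Inns"].value_counts().sort_index()

print(centuries_by_inns)
```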

Now let us look at the kinds of dismissals Virat Kohli faced.

In [16]:
dismissal = data["Dismissal"].value_counts()
label = dismissal.index
counts = dismissal.values
colors = ['gold','lightgreen', "pink", "blue", "skyblue", "cyan", "orange"]

fig = go.Figure(data=[go.Pie(labels=label, values=counts)])
fig.update_layout(title_text='Dismissals of Virat Kohli')
fig.update_traces(hoverinfo='label+percent', textinfo='value', textfont_size=30,
                  marker=dict(colors=colors, line=dict(color='black', width=3)))
fig.show()

Kohli is most often dismissed caught, by either a fielder or the wicketkeeper. Now let us see against which team he scored most of his runs.

In [17]:
figure = px.bar(data, x=data["Opposition"], y = data["Runs"], color = data["Runs"],
            title="Most Runs Against Teams")
figure.show()

Virat Kohli has scored heavily against Sri Lanka, Australia, New Zealand, West Indies, and England, but he scored most of his runs against Sri Lanka.
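
The ranking of opponents by total runs can be checked numerically with a grouped sum. The following is a minimal sketch on made-up innings. The `toy` frame and the `runs_by_team` name are hypothetical, and the totals are not those of the real dataset.

```python
import pandas as pd

# Hypothetical toy innings, using the same "v <team>" labels as the dataset.
toy = pd.DataFrame({
    "Opposition": ["v Sri Lanka", "v Australia", "v Sri Lanka", "v England"],
    "Runs": [80, 100, 60, 55],
})

# Total runs per opponent, highest first.
runs_by_team = toy.groupby("Opposition")["Runs"].sum().sort_values(ascending=False)

print(runs_by_team)
```

Totals of this kind favour teams faced more often, so dividing by the number of innings against each team gives a fairer comparison.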

Now let’s look at which team Virat Kohli scored most of his centuries against.

In [18]:
figure=px.bar(centuries,x=centuries['Opposition'],y=centuries['Runs'],color = centuries["Runs"],
                title="Most Centuries Against Teams")
figure.show()

Most of the centuries were scored against Australia.

Taking strike rate into consideration, we need to create a new dataset of all the matches played by Virat Kohli in which his strike rate was more than 120.
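
Since `SR` is stored as text (dtype `object`), it has to be converted to numbers before it can be compared with 120. The following is a minimal sketch on made-up rows, assuming that a `-` entry means no strike rate was recorded. The `toy` frame and the `high_sr` name are hypothetical.

```python
import pandas as pd

# Hypothetical toy rows; "-" is assumed to stand for a missing strike rate.
toy = pd.DataFrame({
    "Runs": [12, 0, 8, 122],
    "SR": ["54.54", "-", "160", "116.19"],
})

# Convert to float; anything that is not a number becomes NaN.
toy["SR"] = pd.to_numeric(toy["SR"], errors="coerce")

# Keep only the innings with a strike rate above 120.
high_sr = toy.query("SR > 120")

print(high_sr)
```

Rows whose strike rate is NaN fail the comparison and are dropped by the filter automatically.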

In [19]:
data["SR"] = data["SR"].str.replace("-", " ")
print(data)
     Runs Mins   BF  4s  6s      SR           Pos Dismissal  Inns  \
0      12   33   22   1   0   54.54  Batting At 2       lbw     1   
1      37   82   67   6   0   55.22  Batting At 2    caught     2   
2      25   40   38   4   0   65.78  Batting At 1   run out     1   
3      54   87   66   7   0   81.81  Batting At 1    bowled     1   
4      31   45   46   3   1   67.39  Batting At 1       lbw     2   
..    ...  ...  ...  ..  ..     ...           ...       ...   ...   
127    45   64   51   2   1   88.23  Batting At 3    caught     2   
128    65  152   76   2   1   85.52  Batting At 3    caught     1   
129   122  147  105   8   5  116.19  Batting At 3    caught     2   
130     8    6    5   2   0     160  Batting At 3    caught     1   
131    55   81   63   8   0    87.3  Batting At 3    caught     2   

        Opposition         Ground Start Date  
0      v Sri Lanka       Dambulla  18-Aug-08  
1      v Sri Lanka       Dambulla  20-Aug-08  
2      v Sri Lanka  Colombo (RPS)  24-Aug-08  
3      v Sri Lanka  Colombo (RPS)  27-Aug-08  
4      v Sri Lanka  Colombo (RPS)  29-Aug-08  
..             ...            ...        ...  
127  v New Zealand         Ranchi  26-Oct-16  
128  v New Zealand  Visakhapatnam  29-Oct-16  
129      v England           Pune  15-Jan-17  
130      v England        Cuttack  19-Jan-17  
131      v England        Kolkata  22-Jan-17  

[132 rows x 12 columns]
In [24]:
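# Keep only innings with a strike rate above 120; non-numeric SR entries become NaN and drop out
strike_rate = data.assign(SR=pd.to_numeric(data["SR"], errors="coerce")).query("SR > 120")
figure = px.bar(strike_rate, x="Inns", y="SR", color="SR",
                title="Virat Kohli's High Strike Rates in First Innings Vs. Second Innings")
figure.show()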
In [27]:
figure = px.scatter(data_frame = data, x="Runs",
                    y="4s",
                    title="Relationship Between Runs Scored and Fours")
figure.show()
In [28]:
figure = px.scatter(data_frame = data, x="Runs",
                    y="6s",  
                    title= "Relationship Between Runs Scored and Sixes")
figure.show()